
    Specification and Simulation of Statistical Query Algorithms for Efficiency and Noise Tolerance

    Abstract: A recent innovation in computational learning theory is the statistical query (SQ) model. The advantage of specifying learning algorithms in this model is that SQ algorithms can be simulated in the probably approximately correct (PAC) model, both in the absence and in the presence of noise. However, simulations of SQ algorithms in the PAC model have non-optimal time and sample complexities. In this paper, we introduce a new method for specifying statistical query algorithms based on a type of relative error and provide simulations in the noise-free and noise-tolerant PAC models which yield more efficient algorithms. Requests for estimates of statistics in this new model take the following form: “Return an estimate of the statistic within a 1 ± μ factor, or return ⊥, promising that the statistic is less than θ.” In addition to showing that this is a very natural language for specifying learning algorithms, we also show that this new specification is polynomially equivalent to standard SQ, and thus known learnability and hardness results for statistical query learning are preserved. We then give highly efficient PAC simulations of relative error SQ algorithms. We show that the learning algorithms obtained by simulating efficient relative error SQ algorithms, both in the absence of noise and in the presence of malicious noise, have roughly optimal sample complexity. We also show that the simulation of efficient relative error SQ algorithms in the presence of classification noise yields learning algorithms at least as efficient as those obtained through standard methods, and in some cases improved, roughly optimal results are achieved. The sample complexities for all of these simulations are based on the dν metric, a type of relative error metric useful for quantities which are small or even zero. We show that uniform convergence with respect to the dν metric yields “uniform convergence” with respect to (μ, θ) accuracy. Finally, while we show that many specific learning algorithms can be written as highly efficient relative error SQ algorithms, we also show that, in fact, all SQ algorithms can be written efficiently by proving general upper bounds on the complexity of (μ, θ) queries as a function of the accuracy parameter ε. As a consequence of this result, we give general upper bounds on the complexity of learning algorithms achieved through the use of relative error SQ algorithms and the simulations described above.
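A (μ, θ) query of the form quoted above can be simulated from random examples with a multiplicative Chernoff bound. The sketch below is a minimal illustration only; the sample-size constant and the `oracle` interface are assumptions, not the paper's exact construction.

```python
import math
import random

def relative_error_sq(oracle, chi, mu, theta, delta):
    """Estimate P = Pr[chi(example) = 1] within a (1 +/- mu) factor,
    or return None (playing the role of 'bottom') when the empirical
    estimate falls below theta.  The sample size m follows a
    multiplicative Chernoff bound; the constant 3 is illustrative."""
    m = math.ceil(3 * math.log(2 / delta) / (mu ** 2 * theta))
    hits = sum(chi(oracle()) for _ in range(m))
    p_hat = hits / m
    return None if p_hat < theta else p_hat

# Toy oracle: labeled coin flips with Pr[label = 1] = 0.3.
random.seed(0)
oracle = lambda: 1 if random.random() < 0.3 else 0
estimate = relative_error_sq(oracle, lambda x: x, mu=0.1, theta=0.05, delta=0.01)
```

Since the true statistic 0.3 is well above θ = 0.05, the query returns a numeric estimate rather than ⊥; a predicate that is never satisfied drives the estimate below θ and triggers the ⊥ branch.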

    General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Boosting

    Abstract: We derive general bounds on the complexity of learning in the statistical query (SQ) model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the SQ model. The SQ model was introduced by Kearns to provide a general framework for efficient PAC learning in the presence of classification noise. We first show a general scheme for boosting the accuracy of weak SQ learning algorithms, proving that weak SQ learning is equivalent to strong SQ learning. The boosting is efficient and is used to show our main result: the first general upper bounds on the complexity of strong SQ learning. Since all SQ algorithms can be simulated in the PAC model with classification noise, we also obtain general upper bounds on learning in the presence of classification noise for classes which can be learned in the SQ model.
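Why combining weak hypotheses can yield strong learning is visible in a Hoeffding-style calculation, shown here only as intuition: it assumes the hypotheses err independently, which actual boosting constructions enforce indirectly by reweighting the distribution between rounds.

```latex
% If T hypotheses each err with probability at most 1/2 - \gamma,
% independently, a majority vote errs only when at least T/2 of them
% err, so by Hoeffding's inequality:
\Pr[\text{majority vote errs}] \;\le\; \exp\!\left(-2\gamma^{2} T\right)
% hence error \varepsilon is reached after
% T = O\!\left(\gamma^{-2}\ln(1/\varepsilon)\right) rounds.
```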

    Automatic Ground Truth Expansion for Timeline Evaluation

    The development of automatic systems that can produce timeline summaries by filtering high-volume streams of text documents, retaining only those that are relevant to a particular information need (e.g. topic or event), remains a very challenging task. To advance the field of automatic timeline generation, robust and reproducible evaluation methodologies are needed. To this end, several evaluation metrics and labeling methodologies have recently been developed, focusing on information-nugget or cluster-based ground truth representations. These methodologies rely on human assessors manually mapping timeline items (e.g. tweets) to an explicit representation of what information a 'good' summary should contain. However, while these evaluation methodologies produce reusable ground truth labels, prior works have reported cases where such labels fail to accurately estimate the performance of new timeline generation systems due to label incompleteness. In this paper, we first quantify the extent to which timeline summary ground truth labels fail to generalize to new summarization systems, then we propose and evaluate new automatic solutions to this issue. In particular, using a depooling methodology over 21 systems and across three high-volume datasets, we quantify the degree of system ranking error caused by excluding those systems when labeling. We show that when considering lower-effectiveness systems, the test collections are robust (the likelihood of systems being mis-ranked is low). However, we show that the risk of systems being mis-ranked increases as the effectiveness of systems held out from the pool increases. To reduce the risk of mis-ranking systems, we also propose two different automatic ground truth label expansion techniques. Our results show that our proposed expansion techniques can be effective for increasing the robustness of the TREC-TS test collections, markedly reducing the number of mis-rankings by up to 50% on average among the scenarios tested.
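The depooling idea can be sketched as follows. The toy runs, pool depth, and the use of precision@k as the effectiveness measure are illustrative assumptions, not the paper's protocol: a system that contributes unique relevant items to the pool loses score (and can be mis-ranked) when it is held out of labeling.

```python
def build_pool(runs, depth):
    """Union of the top-`depth` items from every contributing run."""
    pool = set()
    for run in runs:
        pool.update(run[:depth])
    return pool

def precision_at_k(run, qrels, k):
    return sum(1 for item in run[:k] if item in qrels) / k

# Toy setup: items t1..t9 are truly relevant; each "system" returns a ranked list.
relevant = {f"t{i}" for i in range(1, 10)}
runs = {
    "sysA": ["t1", "t2", "t3", "x1", "x2"],
    "sysB": ["t2", "t3", "x3", "t1", "x4"],
    "sysC": ["t7", "t8", "t9", "x5", "x6"],  # finds relevant items no one else does
}

def qrels_from_pool(held_out=None):
    """Assessors only label pooled items, so qrels = pool intersect relevant."""
    contributing = [r for s, r in runs.items() if s != held_out]
    return build_pool(contributing, depth=5) & relevant

full = {s: precision_at_k(r, qrels_from_pool(), 5) for s, r in runs.items()}
# Depool: rebuild labels with sysC excluded from the pool, then re-score everyone.
depooled = {s: precision_at_k(r, qrels_from_pool("sysC"), 5) for s, r in runs.items()}
```

Under the full pool all three systems tie; once sysC is held out, its unique relevant items go unlabeled and its measured effectiveness collapses, which is exactly the ranking error the depooling methodology quantifies.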

    On Estimating the Size and Confidence of a Statistical Audit

    We consider the problem of statistical sampling for auditing elections, and we develop a remarkably simple and easily calculated upper bound for the sample size necessary for determining, with probability at least c, whether a given set of n objects contains b or more “bad” objects. While the size of the optimal sample drawn without replacement can be determined with a computer program, our goal is to derive a highly accurate and simple formula that can be used by election officials equipped with only a simple calculator.
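The exact optimal sample size for sampling without replacement is a short hypergeometric computation, sketched below alongside a closed-form approximation in the spirit of the paper (the precise form of the approximation here is an assumption for illustration, not necessarily the paper's formula).

```python
from math import comb

def detection_prob(n, b, u):
    """P(a uniform sample of size u, drawn without replacement from n objects,
    contains at least one of the b bad objects)."""
    if u > n - b:
        return 1.0
    return 1 - comb(n - b, u) / comb(n, u)

def min_sample_size(n, b, c):
    """Smallest u whose detection probability is at least c (exact, by search)."""
    for u in range(n + 1):
        if detection_prob(n, b, u) >= c:
            return u

def approx_sample_size(n, b, c):
    """Calculator-friendly closed-form approximation (illustrative form)."""
    return (n - (b - 1) / 2) * (1 - (1 - c) ** (1 / b))
```

For n = 1000 objects, b = 10 bad objects, and confidence c = 0.95, the approximation lands within a couple of ballots of the exact hypergeometric answer.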

    Bayes Optimal Metasearch: A Probabilistic Model for Combining the Results of Multiple Retrieval Systems

    We introduce a new, probabilistic model for combining the outputs of an arbitrary number of query retrieval systems. By gathering simple statistics on the average performance of a given set of query retrieval systems, we construct a Bayes optimal mechanism for combining the outputs of these systems. Our construction yields a metasearch strategy whose empirical performance nearly always exceeds the performance of any of the constituent systems. Our construction is also robust in the sense that if “good” and “bad” systems are combined, the performance of the composite is still on par with, or exceeds, that of the best constituent system. Finally, our model and theory provide theoretical and empirical avenues for the improvement of this metasearch strategy.
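A naive-Bayes flavour of this combination can be sketched as follows. The binary "in the top k" evidence, the likelihood values, and the independence assumption across systems are illustrative simplifications; the paper's model is richer.

```python
import math

def combine(ranked_lists, p_top_given_rel, p_top_given_irr, top_k=3):
    """Score each document by summing per-system log-likelihood ratios of the
    evidence 'document appears in this system's top_k' (naive independence),
    then rank by total score."""
    docs = set().union(*ranked_lists)
    scores = {}
    for d in docs:
        llr = 0.0
        for run in ranked_lists:
            in_top = d in run[:top_k]
            p_rel = p_top_given_rel if in_top else 1 - p_top_given_rel
            p_irr = p_top_given_irr if in_top else 1 - p_top_given_irr
            llr += math.log(p_rel / p_irr)
        scores[d] = llr
    return sorted(docs, key=lambda d: -scores[d])

runs = [["a", "b", "c", "z"], ["b", "a", "d", "z"], ["a", "c", "b", "z"]]
# Assumed training statistics: relevant documents reach a system's top 3
# far more often (0.8) than irrelevant ones do (0.1).
fused = combine(runs, p_top_given_rel=0.8, p_top_given_irr=0.1)
```

Documents endorsed by every system accumulate large positive log-likelihood ratios and rise to the top of the fused list, while a document no system places highly sinks to the bottom.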

    Efficiency of Treated Domestic Wastewater to Irrigate Two Rice Cultivars, PK 386 and Basmati 515, under a Hydroponic Culture System

    The growing human population continues to exacerbate freshwater scarcity, and the availability of freshwater for crop irrigation has become challenging. The present study aimed to use domestic wastewater (DWW) for the irrigation of two rice cultivars (CVs) after treatment with the bacterial strain Alcaligenes faecalis MT477813 under a hydroponic culture system. The first part of this study focused on the bioremediation and analysis of the physicochemical parameters of DWW to compare pollutants before and after treatment. The biotreatment of DWW with the bacterial isolate showed more than 90% decolourisation, along with a reduction in contaminants. The next part of the study evaluated the impacts of treated and untreated DWW on the growth of the two rice cultivars, PK 386 and Basmati 515, under a hydroponic culture system, which supplies nutrients and water to plants and can give yields equal to or higher than soil. Growth parameters such as shoot and root length and the wet and dry weights of the rice plants grown in treated DWW were considerably higher than those of plants grown in untreated DWW; enhanced growth of both rice cultivars in biotreated DWW was thus observed. These results demonstrate the bioremediation efficiency of the bacterial isolate and the utility of DWW for rice crop irrigation subsequent to biotreatment.

    Disposition Kinetics and Optimal Dosage of Ciprofloxacin in Healthy Domestic Ruminant Species

    The purpose of this experimental study was to determine the disposition kinetics and optimal dosages of ciprofloxacin in healthy domestic ruminant species, including adult female buffalo, cow, sheep and goat. The drug was given as a single intramuscular dose of 5 mg/kg. The plasma concentrations of the drug were determined with HPLC, and pharmacokinetic variables were calculated. The biological half-life (t1/2β) was longest in cows (3.25 ± 0.46 h), intermediate in buffaloes (3.05 ± 0.20 h) and sheep (2.93 ± 0.45 h), and shortest in goats (2.62 ± 0.39 h). The volume of distribution (Vd) was 1.09 ± 0.06 l/kg in buffaloes, 1.24 ± 0.16 l/kg in cows, 2.89 ± 0.30 l/kg in sheep and 3.76 ± 0.92 l/kg in goats. Total body clearance (ClB), expressed in l/h/kg, was lowest in buffaloes (0.25 ± 0.02), followed by cows (0.31 ± 0.02) and sheep (0.75 ± 0.04), and highest in goats (1.09 ± 0.11). An optimal dosage regimen for a 12-h interval consisted of 5.17, 5.62, 6.54 and 6.10 mg/kg body weight as priming doses and 4.84, 5.37, 6.26 and 5.91 mg/kg body weight as maintenance intramuscular doses in buffalo, cow, sheep and goat, respectively. The manufacturers of ciprofloxacin recommend that the 5 mg/kg dose be repeated after 24 h; however, the investigated dosage regimen may be repeated after 12 h to maintain the MIC at the end of the dosage interval. It is therefore imperative that an optimal dosage regimen be based on disposition kinetics data determined in the species and environment in which a drug is to be employed clinically.
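Under a simple one-compartment model with first-order elimination (an idealisation: intramuscular absorption is ignored, and the target MIC used below is a hypothetical placeholder, not a value from the study), maintenance and priming doses for a dosing interval τ follow from the steady-state trough condition:

```python
import math

def dosage_regimen(t_half, vd, mic, tau):
    """Doses (mg/kg) that hold the steady-state trough concentration at `mic`
    (mg/l) for a drug with half-life `t_half` (h) and volume of distribution
    `vd` (l/kg), dosed every `tau` hours.  Assumes one-compartment kinetics
    with instantaneous absorption."""
    k = math.log(2) / t_half                    # first-order elimination rate (1/h)
    maintenance = mic * vd * (math.exp(k * tau) - 1)
    priming = mic * vd * math.exp(k * tau)      # reaches steady state immediately
    return priming, maintenance

# Goat parameters from the study; an MIC of 0.1 mg/l is a hypothetical target.
priming, maintenance = dosage_regimen(t_half=2.62, vd=3.76, mic=0.1, tau=12)
```

The priming dose exceeds the maintenance dose by exactly the trough amount (mic × vd), which is why priming and maintenance doses in such regimens differ only slightly when the interval spans several half-lives.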

    Clinical practice guidelines on the management of variceal bleeding

    Get PDF
    Gastroesophageal variceal bleeding occurs in 30-50% of patients with liver cirrhosis and portal hypertension, with 20-70% mortality within one year. Therefore, it is essential to screen these patients for varices and prevent the first episode of bleeding by treating them with β-blockers or endoscopic variceal band ligation. Ideally, patients with variceal bleeding should be treated in a unit where the personnel are familiar with the management of such patients and where routine therapeutic interventions can be undertaken. Proper management of such patients includes initial assessment, resuscitation, blood volume replacement, vasoactive agents, prevention of associated complications such as bacterial infections, hepatic encephalopathy, coagulopathy and thrombocytopenia, and specific therapy. Rebleeding occurs in about 60% of patients within 2 years of their recovery from the first variceal bleeding episode, with 33% mortality. Therefore, it is mandatory that all such patients be started on a combination of β-blockers and band ligation to prevent recurrence of bleeding. Patients who required shunt surgery/TIPSS to control the acute episode do not require further preventive measures. These clinical practice guidelines (CPGs) have been jointly developed by the Pakistan Society of Hepatology (PSH) and the Pakistan Society of Study of Liver Diseases (PSSLD).